Overview

Dataset statistics

Number of variables: 37
Number of observations: 8284
Missing cells: 159532
Missing cells (%): 52.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 MiB
Average record size in memory: 296.0 B
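
The percentage and memory figures above can be cross-checked from the raw counts. The following is a minimal sketch in Python. It assumes that the missing-cell percentage is missing cells divided by variables times observations, and that the total size is roughly the average record size times the number of observations. Both relationships are assumptions about how the figures fit together, not statements taken from the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 37
n_observations = 8284
missing_cells = 159532
avg_record_bytes = 296.0

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed relationship: total size ~ average record size x observations.
total_mib = avg_record_bytes * n_observations / 1024**2

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approximate size: {total_mib:.1f} MiB")
```

Under these assumptions the sketch reproduces the reported 52.0% and 2.3 MiB.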

Variable types

Numeric: 1
Text: 8
Categorical: 10
Unsupported: 18

Alerts

NIVEL has constant value "DIVERSIFICADO" [Constant]
Unnamed: 0.1 is highly overall correlated with DEPARTAMENTO and 2 other fields [High correlation]
DEPARTAMENTO is highly overall correlated with Unnamed: 0.1 and 2 other fields [High correlation]
JORNADA is highly overall correlated with PLAN [High correlation]
PLAN is highly overall correlated with JORNADA [High correlation]
DEPARTAMENTAL is highly overall correlated with Unnamed: 0.1 and 2 other fields [High correlation]
ZONA is highly overall correlated with Unnamed: 0.1 and 2 other fields [High correlation]
SECTOR is highly imbalanced (64.2%) [Imbalance]
AREA is highly imbalanced (57.4%) [Imbalance]
STATUS is highly imbalanced (53.0%) [Imbalance]
MODALIDAD is highly imbalanced (81.6%) [Imbalance]
PLAN is highly imbalanced (55.4%) [Imbalance]
DISTRITO has 197 (2.4%) missing values [Missing]
TELEFONO has 1532 (18.5%) missing values [Missing]
SUPERVISOR has 198 (2.4%) missing values [Missing]
DIRECTOR has 1693 (20.4%) missing values [Missing]
(blank column name) has 8284 (100.0%) missing values [Missing]
Unnamed: 1 has 8284 (100.0%) missing values [Missing]
Unnamed: 2 has 8284 (100.0%) missing values [Missing]
Unnamed: 3 has 8284 (100.0%) missing values [Missing]
Unnamed: 4 has 8284 (100.0%) missing values [Missing]
Unnamed: 5 has 8284 (100.0%) missing values [Missing]
Unnamed: 6 has 8284 (100.0%) missing values [Missing]
Unnamed: 7 has 8284 (100.0%) missing values [Missing]
Unnamed: 8 has 8284 (100.0%) missing values [Missing]
Unnamed: 9 has 8284 (100.0%) missing values [Missing]
Unnamed: 10 has 8284 (100.0%) missing values [Missing]
Unnamed: 11 has 8284 (100.0%) missing values [Missing]
Unnamed: 12 has 8284 (100.0%) missing values [Missing]
Unnamed: 13 has 8284 (100.0%) missing values [Missing]
Unnamed: 14 has 8284 (100.0%) missing values [Missing]
Unnamed: 15 has 8284 (100.0%) missing values [Missing]
Unnamed: 16 has 8284 (100.0%) missing values [Missing]
Unnamed: 0 has 8284 (100.0%) missing values [Missing]
ZONA has 6748 (81.5%) missing values [Missing]
Unnamed: 0.1 is uniformly distributed [Uniform]
Unnamed: 0.1 has unique values [Unique]
CODIGO has unique values [Unique]
(blank column name) is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 1 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 2 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 13 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 14 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 15 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 16 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 0 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
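
The missing-value alerts can be reconciled with the 159532 missing cells reported in the dataset statistics. A small sketch, using only counts listed in this report, suggests that the 18 fully empty columns account for most of the missing cells. The grouping into "fully empty" and "partially missing" columns is an illustration, not part of the report.

```python
n_observations = 8284
reported_missing_cells = 159532

# Columns flagged as 100% missing: the blank-named column,
# "Unnamed: 0", and "Unnamed: 1" through "Unnamed: 16".
fully_empty_columns = 18

# Partially missing columns listed in the alerts.
partial_missing = {
    "DISTRITO": 197,
    "TELEFONO": 1532,
    "SUPERVISOR": 198,
    "DIRECTOR": 1693,
    "ZONA": 6748,
}

from_empty = fully_empty_columns * n_observations
from_partial = sum(partial_missing.values())
unexplained = reported_missing_cells - from_empty - from_partial
empty_share = 100 * from_empty / reported_missing_cells

print(f"cells in fully empty columns: {from_empty} ({empty_share:.1f}%)")
print(f"cells in alerted partial columns: {from_partial}")
print(f"not covered by alerts: {unexplained}")
```

The remainder of 52 cells matches a text variable in the Variables section that reports 52 missing values (0.6%), which is presumably too few to raise an alert.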

Reproduction

Analysis started: 2023-07-31 23:39:51.088332
Analysis finished: 2023-07-31 23:39:56.873629
Duration: 5.79 seconds
Software version: ydata-profiling v4.3.2
Download configuration: config.json
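
A report of this kind is typically produced by passing a pandas DataFrame to `ProfileReport`. The sketch below is a hedged illustration only: the input format (CSV), the file paths, and the helper name `build_report` are assumptions, since the report does not say how the data was loaded or which options were used.

```python
def build_report(csv_path: str, html_path: str, title: str = "Profiling report"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so that defining this helper does not require the
    # third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title=title)
    profile.to_file(html_path)
    return profile
```

A call such as `build_report("input.csv", "report.html")` would then write the HTML file, with both file names standing in for the real ones.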

Variables

Unnamed: 0.1
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 8284
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4155.3445
Minimum: 0
Maximum: 8321
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 64.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 416.15
Q1: 2074.75
Median: 4153.5
Q3: 6236.25
95-th percentile: 7902.85
Maximum: 8321
Range: 8321
Interquartile range (IQR): 4161.5

Descriptive statistics

Standard deviation: 2402.7544
Coefficient of variation (CV): 0.5782323
Kurtosis: -1.2007245
Mean: 4155.3445
Median Absolute Deviation (MAD): 2081
Skewness: 0.0029841373
Sum: 34422874
Variance: 5773228.7
Monotonicity: Strictly increasing
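
Several of these statistics are simple functions of one another. The sketch below recomputes them from the reported values using the standard definitions (CV = standard deviation / mean, IQR = Q3 - Q1, variance = standard deviation squared). The final gap count additionally assumes that every value in the column is an integer, which the extreme-value listings suggest but the report does not state.

```python
# Values copied from the quantile and descriptive statistics above.
mean = 4155.3445
std = 2402.7544
q1, q3 = 2074.75, 6236.25
minimum, maximum = 0, 8321
n = 8284

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2
total = mean * n                 # sum of all values

# Assuming integer values: how many integers in [minimum, maximum]
# do not occur in this strictly increasing, all-unique column?
absent_integers = (value_range + 1) - n

print(cv, iqr, value_range, variance, total, absent_integers)
```

The gap count indicates that the column is not a complete 0..N row index, which would be consistent with rows having been removed before profiling, although the report itself does not say so.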
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
0                    1      < 0.1%
5520                 1      < 0.1%
5550                 1      < 0.1%
5549                 1      < 0.1%
5548                 1      < 0.1%
5547                 1      < 0.1%
5546                 1      < 0.1%
5545                 1      < 0.1%
5544                 1      < 0.1%
5543                 1      < 0.1%
Other values (8274)  8274   99.9%

Minimum 10 values

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
8321   1      < 0.1%
8320   1      < 0.1%
8319   1      < 0.1%
8318   1      < 0.1%
8317   1      < 0.1%
8316   1      < 0.1%
8315   1      < 0.1%
8314   1      < 0.1%
8313   1      < 0.1%
8312   1      < 0.1%

CODIGO
Text

UNIQUE 

Distinct: 8284
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 107692
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
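
As an illustration of how such per-character properties can be obtained, the sketch below counts the Unicode general categories of one CODIGO sample value with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its own tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# 'Nd' is Decimal Number, 'Pd' is Dash Punctuation.
counts = category_counts("16-01-0138-46")
print(counts)
```

Ten digits and three hyphens per value, multiplied by 8284 rows, give the 82840 Decimal Number and 24852 Dash Punctuation characters reported for this variable.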

Unique

Unique: 8284
Unique (%): 100.0%

Sample

1st row: 16-01-0138-46
2nd row: 16-01-0139-46
3rd row: 16-01-0140-46
4th row: 16-01-0141-46
5th row: 16-01-0142-46
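
Because every CODIGO value is 13 characters long and the samples share one layout, a format check is easy to sketch. The pattern below (groups of 2, 2, 4 and 2 digits separated by hyphens) is inferred from the five sample rows and the character counts only. It is an assumption, not a documented specification of the code.

```python
import re

# Assumed layout inferred from the samples: NN-NN-NNNN-NN.
CODIGO_PATTERN = re.compile(r"\d{2}-\d{2}-\d{4}-\d{2}")


def is_valid_codigo(value: str) -> bool:
    """Return True if `value` matches the assumed CODIGO layout exactly."""
    return CODIGO_PATTERN.fullmatch(value) is not None


print(is_valid_codigo("16-01-0138-46"))
print(is_valid_codigo("16-01-138-46"))
```

Using `fullmatch` rather than `match` rejects values with trailing characters, which matters when the codes are later used as join keys.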
ValueCountFrequency (%)
16-01-0138-46 1
 
< 0.1%
16-01-1174-46 1
 
< 0.1%
16-01-0141-46 1
 
< 0.1%
16-01-0142-46 1
 
< 0.1%
16-01-0143-46 1
 
< 0.1%
16-01-0145-46 1
 
< 0.1%
16-01-0147-46 1
 
< 0.1%
16-01-0150-46 1
 
< 0.1%
16-01-0155-46 1
 
< 0.1%
16-01-0428-46 1
 
< 0.1%
Other values (8274) 8274
99.9%

Most occurring characters

ValueCountFrequency (%)
- 24852
23.1%
0 23490
21.8%
1 13145
12.2%
4 11824
11.0%
6 11251
10.4%
2 5431
 
5.0%
3 4279
 
4.0%
5 3660
 
3.4%
8 3381
 
3.1%
7 3263
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 82840
76.9%
Dash Punctuation 24852
 
23.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 23490
28.4%
1 13145
15.9%
4 11824
14.3%
6 11251
13.6%
2 5431
 
6.6%
3 4279
 
5.2%
5 3660
 
4.4%
8 3381
 
4.1%
7 3263
 
3.9%
9 3116
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 24852
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 107692
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 24852
23.1%
0 23490
21.8%
1 13145
12.2%
4 11824
11.0%
6 11251
10.4%
2 5431
 
5.0%
3 4279
 
4.0%
5 3660
 
3.4%
8 3381
 
3.1%
7 3263
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 107692
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 24852
23.1%
0 23490
21.8%
1 13145
12.2%
4 11824
11.0%
6 11251
10.4%
2 5431
 
5.0%
3 4279
 
4.0%
5 3660
 
3.4%
8 3381
 
3.1%
7 3263
 
3.0%

DISTRITO
Text

MISSING 

Distinct: 572
Distinct (%): 7.1%
Missing: 197
Missing (%): 2.4%
Memory size: 64.8 KiB
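
The Distinct (%) figure appears to be computed over non-missing values rather than over all rows. The sketch below checks that reading against two variables from this report. The helper `distinct_pct` is hypothetical and the denominator is an inference from the numbers, not a documented rule.

```python
N_ROWS = 8284


def distinct_pct(distinct: int, missing: int, n_rows: int = N_ROWS) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    return 100 * distinct / (n_rows - missing)


distrito = distinct_pct(distinct=572, missing=197)
telefono = distinct_pct(distinct=4112, missing=1532)
distrito_all_rows = 100 * 572 / N_ROWS

print(f"DISTRITO: {distrito:.1f}% (over all rows: {distrito_all_rows:.1f}%)")
print(f"TELEFONO: {telefono:.1f}%")
```

Dividing by all 8284 rows would give a lower value for DISTRITO than the one reported, while the non-missing denominator reproduces both reported percentages.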

Length

Max length: 6
Median length: 6
Mean length: 5.9877581
Min length: 3

Characters and Unicode

Total characters: 48423
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): 0.8%

Sample

1st row: 16-031
2nd row: 16-031
3rd row: 16-031
4th row: 16-005
5th row: 16-005
ValueCountFrequency (%)
01-403 242
 
3.0%
11-017 175
 
2.2%
05-033 159
 
2.0%
01-411 150
 
1.9%
18-008 128
 
1.6%
01-409 102
 
1.3%
05-007 100
 
1.2%
18-039 98
 
1.2%
13-004 92
 
1.1%
03-002 91
 
1.1%
Other values (562) 6750
83.5%

Most occurring characters

ValueCountFrequency (%)
0 14232
29.4%
1 9654
19.9%
- 8087
16.7%
2 3377
 
7.0%
3 3187
 
6.6%
4 2475
 
5.1%
6 1792
 
3.7%
5 1558
 
3.2%
9 1470
 
3.0%
7 1430
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40336
83.3%
Dash Punctuation 8087
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 14232
35.3%
1 9654
23.9%
2 3377
 
8.4%
3 3187
 
7.9%
4 2475
 
6.1%
6 1792
 
4.4%
5 1558
 
3.9%
9 1470
 
3.6%
7 1430
 
3.5%
8 1161
 
2.9%
Dash Punctuation
ValueCountFrequency (%)
- 8087
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 48423
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 14232
29.4%
1 9654
19.9%
- 8087
16.7%
2 3377
 
7.0%
3 3187
 
6.6%
4 2475
 
5.1%
6 1792
 
3.7%
5 1558
 
3.2%
9 1470
 
3.0%
7 1430
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48423
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 14232
29.4%
1 9654
19.9%
- 8087
16.7%
2 3377
 
7.0%
3 3187
 
6.6%
4 2475
 
5.1%
6 1792
 
3.7%
5 1558
 
3.2%
9 1470
 
3.0%
7 1430
 
3.0%

DEPARTAMENTO
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

GUATEMALA: 2970
ESCUINTLA: 599
HUEHUETENANGO: 495
QUETZALTENANGO: 476
PETEN: 379
Other values (14): 3365

Length

Max length: 14
Median length: 13
Mean length: 9.7014727
Min length: 5

Characters and Unicode

Total characters: 80367
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ALTA VERAPAZ
2nd row: ALTA VERAPAZ
3rd row: ALTA VERAPAZ
4th row: ALTA VERAPAZ
5th row: ALTA VERAPAZ

Common Values

Value             Count  Frequency (%)
GUATEMALA         2970   35.9%
ESCUINTLA         599    7.2%
HUEHUETENANGO     495    6.0%
QUETZALTENANGO    476    5.7%
PETEN             379    4.6%
SUCHITEPEQUEZ     377    4.6%
IZABAL            360    4.3%
CHIMALTENANGO     349    4.2%
ALTA VERAPAZ      348    4.2%
JUTIAPA           320    3.9%
Other values (9)  1611   19.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
guatemala 2970
33.5%
escuintla 599
 
6.8%
huehuetenango 495
 
5.6%
quetzaltenango 476
 
5.4%
verapaz 468
 
5.3%
peten 379
 
4.3%
suchitepequez 377
 
4.2%
izabal 360
 
4.1%
chimaltenango 349
 
3.9%
alta 348
 
3.9%
Other values (11) 2053
23.1%

Most occurring characters

ValueCountFrequency (%)
A 16901
21.0%
E 10750
13.4%
U 7627
9.5%
T 7590
9.4%
L 6167
 
7.7%
G 4412
 
5.5%
N 3798
 
4.7%
M 3491
 
4.3%
I 2681
 
3.3%
H 2441
 
3.0%
Other values (11) 14509
18.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 79777
99.3%
Space Separator 590
 
0.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 16901
21.2%
E 10750
13.5%
U 7627
9.6%
T 7590
9.5%
L 6167
 
7.7%
G 4412
 
5.5%
N 3798
 
4.8%
M 3491
 
4.4%
I 2681
 
3.4%
H 2441
 
3.1%
Other values (10) 13919
17.4%
Space Separator
ValueCountFrequency (%)
590
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 79777
99.3%
Common 590
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 16901
21.2%
E 10750
13.5%
U 7627
9.6%
T 7590
9.5%
L 6167
 
7.7%
G 4412
 
5.5%
N 3798
 
4.8%
M 3491
 
4.4%
I 2681
 
3.4%
H 2441
 
3.1%
Other values (10) 13919
17.4%
Common
ValueCountFrequency (%)
590
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 80367
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 16901
21.0%
E 10750
13.4%
U 7627
9.5%
T 7590
9.4%
L 6167
 
7.7%
G 4412
 
5.5%
N 3798
 
4.7%
M 3491
 
4.3%
I 2681
 
3.3%
H 2441
 
3.0%
Other values (11) 14509
18.1%

(variable name missing from the extracted report)
Text

Distinct: 271
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Length

Max length: 28
Median length: 24
Mean length: 12.239256
Min length: 5

Characters and Unicode

Total characters: 101390
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): 0.1%

Sample

1st row: COBAN
2nd row: COBAN
3rd row: COBAN
4th row: COBAN
5th row: COBAN
ValueCountFrequency (%)
ciudad 1565
 
10.9%
capital 1536
 
10.7%
san 1231
 
8.5%
villa 455
 
3.2%
mixco 420
 
2.9%
nueva 400
 
2.8%
santa 354
 
2.5%
la 245
 
1.7%
quetzaltenango 241
 
1.7%
miguel 183
 
1.3%
Other values (282) 7782
54.0%

Most occurring characters

ValueCountFrequency (%)
A 18704
18.4%
I 7496
 
7.4%
C 7049
 
7.0%
N 6753
 
6.7%
T 6696
 
6.6%
L 6568
 
6.5%
U 6330
 
6.2%
E 6223
 
6.1%
6128
 
6.0%
O 4485
 
4.4%
Other values (15) 24958
24.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 95262
94.0%
Space Separator 6128
 
6.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 18704
19.6%
I 7496
 
7.9%
C 7049
 
7.4%
N 6753
 
7.1%
T 6696
 
7.0%
L 6568
 
6.9%
U 6330
 
6.6%
E 6223
 
6.5%
O 4485
 
4.7%
S 4350
 
4.6%
Other values (14) 20608
21.6%
Space Separator
ValueCountFrequency (%)
6128
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 95262
94.0%
Common 6128
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 18704
19.6%
I 7496
 
7.9%
C 7049
 
7.4%
N 6753
 
7.1%
T 6696
 
7.0%
L 6568
 
6.9%
U 6330
 
6.6%
E 6223
 
6.5%
O 4485
 
4.7%
S 4350
 
4.6%
Other values (14) 20608
21.6%
Common
ValueCountFrequency (%)
6128
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 101390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 18704
18.4%
I 7496
 
7.4%
C 7049
 
7.0%
N 6753
 
6.7%
T 6696
 
6.6%
L 6568
 
6.5%
U 6330
 
6.2%
E 6223
 
6.1%
6128
 
6.0%
O 4485
 
4.4%
Other values (15) 24958
24.6%

(variable name missing from the extracted report)
Text

Distinct: 4498
Distinct (%): 54.3%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Length

Max length: 125
Median length: 103
Mean length: 40.105142
Min length: 3

Characters and Unicode

Total characters: 332231
Distinct characters: 49
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2892
Unique (%): 34.9%

Sample

1st row: COLEGIO COBAN
2nd row: COLEGIO PARTICULAR MIXTO VERAPAZ
3rd row: COLEGIO "LA INMACULADA"
4th row: ESCUELA NACIONAL DE CIENCIAS COMERCIALES
5th row: INSTITUTO NORMAL MIXTO DEL NORTE 'EMILIO ROSALES PONCE'
ValueCountFrequency (%)
de 3297
 
7.6%
colegio 3225
 
7.4%
mixto 2525
 
5.8%
instituto 2258
 
5.2%
liceo 1578
 
3.6%
privado 1256
 
2.9%
educacion 1239
 
2.8%
centro 1103
 
2.5%
educativo 739
 
1.7%
diversificada 670
 
1.5%
Other values (2940) 25751
59.0%

Most occurring characters

ValueCountFrequency (%)
35381
10.6%
I 34859
10.5%
O 32860
9.9%
E 29369
 
8.8%
A 28154
 
8.5%
C 23203
 
7.0%
T 20646
 
6.2%
N 19098
 
5.7%
L 15791
 
4.8%
R 15148
 
4.6%
Other values (39) 77722
23.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 290708
87.5%
Space Separator 35381
 
10.6%
Other Punctuation 4655
 
1.4%
Dash Punctuation 743
 
0.2%
Decimal Number 354
 
0.1%
Open Punctuation 194
 
0.1%
Close Punctuation 193
 
0.1%
Modifier Symbol 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 34859
12.0%
O 32860
11.3%
E 29369
10.1%
A 28154
9.7%
C 23203
 
8.0%
T 20646
 
7.1%
N 19098
 
6.6%
L 15791
 
5.4%
R 15148
 
5.2%
D 12248
 
4.2%
Other values (16) 59332
20.4%
Decimal Number
ValueCountFrequency (%)
2 125
35.3%
0 70
19.8%
1 57
16.1%
3 33
 
9.3%
4 20
 
5.6%
7 17
 
4.8%
6 11
 
3.1%
8 7
 
2.0%
9 7
 
2.0%
5 7
 
2.0%
Other Punctuation
ValueCountFrequency (%)
" 2913
62.6%
' 841
 
18.1%
. 776
 
16.7%
, 106
 
2.3%
& 9
 
0.2%
/ 7
 
0.2%
% 2
 
< 0.1%
# 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
35381
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 743
100.0%
Open Punctuation
ValueCountFrequency (%)
( 194
100.0%
Close Punctuation
ValueCountFrequency (%)
) 193
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 290708
87.5%
Common 41523
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 34859
12.0%
O 32860
11.3%
E 29369
10.1%
A 28154
9.7%
C 23203
 
8.0%
T 20646
 
7.1%
N 19098
 
6.6%
L 15791
 
5.4%
R 15148
 
5.2%
D 12248
 
4.2%
Other values (16) 59332
20.4%
Common
ValueCountFrequency (%)
35381
85.2%
" 2913
 
7.0%
' 841
 
2.0%
. 776
 
1.9%
- 743
 
1.8%
( 194
 
0.5%
) 193
 
0.5%
2 125
 
0.3%
, 106
 
0.3%
0 70
 
0.2%
Other values (13) 181
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 332231
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35381
10.6%
I 34859
10.5%
O 32860
9.9%
E 29369
 
8.8%
A 28154
 
8.5%
C 23203
 
7.0%
T 20646
 
6.2%
N 19098
 
5.7%
L 15791
 
4.8%
R 15148
 
4.6%
Other values (39) 77722
23.4%

(variable name missing from the extracted report)
Text

Distinct: 5351
Distinct (%): 65.0%
Missing: 52
Missing (%): 0.6%
Memory size: 64.8 KiB

Length

Max length: 110
Median length: 91
Mean length: 28.775632
Min length: 4

Characters and Unicode

Total characters: 236881
Distinct characters: 49
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3966
Unique (%): 48.2%

Sample

1st row: KM.2 SALIDA A SAN JUAN CHAMELCO ZONA 8
2nd row: KM 209.5 ENTRADA A LA CIUDAD
3rd row: 7A. AVENIDA 11-109 ZONA 6
4th row: 2A CALLE 11-10 ZONA 2
5th row: 3A AVE 6-23 ZONA 11
ValueCountFrequency (%)
zona 3695
 
8.3%
calle 2663
 
6.0%
avenida 2047
 
4.6%
1 1644
 
3.7%
barrio 1045
 
2.3%
colonia 1010
 
2.3%
aldea 915
 
2.1%
el 850
 
1.9%
san 787
 
1.8%
2 568
 
1.3%
Other values (3393) 29317
65.8%

Most occurring characters

ValueCountFrequency (%)
36309
15.3%
A 34290
14.5%
E 15070
 
6.4%
L 14795
 
6.2%
O 14178
 
6.0%
N 14105
 
6.0%
I 10781
 
4.6%
C 9205
 
3.9%
R 8585
 
3.6%
1 6494
 
2.7%
Other values (39) 73069
30.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 159703
67.4%
Space Separator 36309
 
15.3%
Decimal Number 28348
 
12.0%
Other Punctuation 7739
 
3.3%
Dash Punctuation 4728
 
2.0%
Lowercase Letter 22
 
< 0.1%
Open Punctuation 16
 
< 0.1%
Close Punctuation 16
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 34290
21.5%
E 15070
9.4%
L 14795
9.3%
O 14178
8.9%
N 14105
8.8%
I 10781
 
6.8%
C 9205
 
5.8%
R 8585
 
5.4%
D 5758
 
3.6%
T 5255
 
3.3%
Other values (16) 27681
17.3%
Decimal Number
ValueCountFrequency (%)
1 6494
22.9%
2 3926
13.8%
3 3398
12.0%
4 2885
10.2%
5 2650
9.3%
0 2340
 
8.3%
6 2052
 
7.2%
7 1739
 
6.1%
9 1439
 
5.1%
8 1425
 
5.0%
Other Punctuation
ValueCountFrequency (%)
. 4666
60.3%
, 2458
31.8%
" 469
 
6.1%
' 90
 
1.2%
/ 36
 
0.5%
# 19
 
0.2%
; 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 20
90.9%
o 2
 
9.1%
Space Separator
ValueCountFrequency (%)
36309
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4728
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 159725
67.4%
Common 77156
32.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 34290
21.5%
E 15070
9.4%
L 14795
9.3%
O 14178
8.9%
N 14105
8.8%
I 10781
 
6.7%
C 9205
 
5.8%
R 8585
 
5.4%
D 5758
 
3.6%
T 5255
 
3.3%
Other values (18) 27703
17.3%
Common
ValueCountFrequency (%)
36309
47.1%
1 6494
 
8.4%
- 4728
 
6.1%
. 4666
 
6.0%
2 3926
 
5.1%
3 3398
 
4.4%
4 2885
 
3.7%
5 2650
 
3.4%
, 2458
 
3.2%
0 2340
 
3.0%
Other values (11) 7302
 
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 236881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
36309
15.3%
A 34290
14.5%
E 15070
 
6.4%
L 14795
 
6.2%
O 14178
 
6.0%
N 14105
 
6.0%
I 10781
 
4.6%
C 9205
 
3.9%
R 8585
 
3.6%
1 6494
 
2.7%
Other values (39) 73069
30.8%

TELEFONO
Text

MISSING 

Distinct: 4112
Distinct (%): 60.9%
Missing: 1532
Missing (%): 18.5%
Memory size: 64.8 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 54016
Distinct characters: 16
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2764
Unique (%): 40.9%

Sample

1st row: 77945104
2nd row: 77367402
3rd row: 78232301
4th row: 79514215
5th row: 79521468
ValueCountFrequency (%)
22067425 21
 
0.3%
79480009 14
 
0.2%
22093200 12
 
0.2%
59304894 11
 
0.2%
45353648 11
 
0.2%
77746400 11
 
0.2%
22322912 10
 
0.1%
78899679 10
 
0.1%
24637777 10
 
0.1%
78394519 9
 
0.1%
Other values (4104) 6636
98.2%

Most occurring characters

ValueCountFrequency (%)
2 7091
13.1%
7 6347
11.8%
4 6067
11.2%
5 5691
10.5%
3 5569
10.3%
8 4879
9.0%
0 4850
9.0%
6 4741
8.8%
1 4378
8.1%
9 4365
8.1%
Other values (6) 38
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 53978
99.9%
Dash Punctuation 20
 
< 0.1%
Other Punctuation 9
 
< 0.1%
Space Separator 7
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 7091
13.1%
7 6347
11.8%
4 6067
11.2%
5 5691
10.5%
3 5569
10.3%
8 4879
9.0%
0 4850
9.0%
6 4741
8.8%
1 4378
8.1%
9 4365
8.1%
Other Punctuation
ValueCountFrequency (%)
, 8
88.9%
/ 1
 
11.1%
Uppercase Letter
ValueCountFrequency (%)
E 1
50.0%
Y 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 20
100.0%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 54014
> 99.9%
Latin 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 7091
13.1%
7 6347
11.8%
4 6067
11.2%
5 5691
10.5%
3 5569
10.3%
8 4879
9.0%
0 4850
9.0%
6 4741
8.8%
1 4378
8.1%
9 4365
8.1%
Other values (4) 36
 
0.1%
Latin
ValueCountFrequency (%)
E 1
50.0%
Y 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 7091
13.1%
7 6347
11.8%
4 6067
11.2%
5 5691
10.5%
3 5569
10.3%
8 4879
9.0%
0 4850
9.0%
6 4741
8.8%
1 4378
8.1%
9 4365
8.1%
Other values (6) 38
 
0.1%

SUPERVISOR
Text

MISSING 

Distinct: 541
Distinct (%): 6.7%
Missing: 198
Missing (%): 2.4%
Memory size: 64.8 KiB

Length

Max length: 63
Median length: 44
Mean length: 29.060722
Min length: 14

Characters and Unicode

Total characters: 234985
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): 0.7%

Sample

1st row: MERCEDES JOSEFINA TORRES GALVEZ
2nd row: MERCEDES JOSEFINA TORRES GALVEZ
3rd row: MERCEDES JOSEFINA TORRES GALVEZ
4th row: RUDY ADOLFO TOT OCH
5th row: RUDY ADOLFO TOT OCH
ValueCountFrequency (%)
de 1885
 
5.4%
martinez 572
 
1.6%
lopez 567
 
1.6%
leon 505
 
1.5%
gonzalez 481
 
1.4%
juan 447
 
1.3%
carlos 396
 
1.1%
morales 376
 
1.1%
hernandez 344
 
1.0%
humberto 324
 
0.9%
Other values (1016) 28830
83.0%

Most occurring characters

ValueCountFrequency (%)
A 29680
12.6%
26641
11.3%
E 22143
 
9.4%
R 18664
 
7.9%
O 17841
 
7.6%
I 15684
 
6.7%
L 14776
 
6.3%
N 13747
 
5.9%
S 9561
 
4.1%
D 7868
 
3.3%
Other values (19) 58380
24.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 208214
88.6%
Space Separator 26641
 
11.3%
Dash Punctuation 124
 
0.1%
Other Punctuation 6
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 29680
14.3%
E 22143
10.6%
R 18664
 
9.0%
O 17841
 
8.6%
I 15684
 
7.5%
L 14776
 
7.1%
N 13747
 
6.6%
S 9561
 
4.6%
D 7868
 
3.8%
C 7740
 
3.7%
Other values (16) 50510
24.3%
Space Separator
ValueCountFrequency (%)
26641
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 124
100.0%
Other Punctuation
ValueCountFrequency (%)
. 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 208214
88.6%
Common 26771
 
11.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 29680
14.3%
E 22143
10.6%
R 18664
 
9.0%
O 17841
 
8.6%
I 15684
 
7.5%
L 14776
 
7.1%
N 13747
 
6.6%
S 9561
 
4.6%
D 7868
 
3.8%
C 7740
 
3.7%
Other values (16) 50510
24.3%
Common
ValueCountFrequency (%)
26641
99.5%
- 124
 
0.5%
. 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 234985
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 29680
12.6%
26641
11.3%
E 22143
 
9.4%
R 18664
 
7.9%
O 17841
 
7.6%
I 15684
 
6.7%
L 14776
 
6.3%
N 13747
 
5.9%
S 9561
 
4.1%
D 7868
 
3.3%
Other values (19) 58380
24.8%

DIRECTOR
Text

MISSING 

Distinct: 4096
Distinct (%): 62.1%
Missing: 1693
Missing (%): 20.4%
Memory size: 64.8 KiB

Length

Max length: 57
Median length: 48
Mean length: 28.676225
Min length: 1

Characters and Unicode

Total characters: 189005
Distinct characters: 34
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2786
Unique (%): 42.3%

Sample

1st row: JULIO CESAR VILLELA AMADO
2nd row: VIRGINA SOLANO SERRANO
3rd row: HECOTR WALDEMAR TOT COY
4th row: LUIS FERNANDO SOTO
5th row: MERCEDES QUIROS QUIROS
ValueCountFrequency (%)
de 1262
 
4.5%
lopez 537
 
1.9%
garcia 337
 
1.2%
maria 327
 
1.2%
hernandez 311
 
1.1%
morales 275
 
1.0%
perez 240
 
0.9%
gonzalez 220
 
0.8%
jose 197
 
0.7%
ramirez 192
 
0.7%
Other values (3497) 23910
86.0%

Most occurring characters

ValueCountFrequency (%)
A 25126
13.3%
21217
11.2%
E 18119
 
9.6%
R 15012
 
7.9%
O 13589
 
7.2%
I 12774
 
6.8%
L 11756
 
6.2%
N 11025
 
5.8%
S 7717
 
4.1%
D 6953
 
3.7%
Other values (24) 45717
24.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 167638
88.7%
Space Separator 21217
 
11.2%
Other Punctuation 76
 
< 0.1%
Dash Punctuation 71
 
< 0.1%
Math Symbol 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 25126
15.0%
E 18119
10.8%
R 15012
 
9.0%
O 13589
 
8.1%
I 12774
 
7.6%
L 11756
 
7.0%
N 11025
 
6.6%
S 7717
 
4.6%
D 6953
 
4.1%
C 6006
 
3.6%
Other values (16) 39561
23.6%
Other Punctuation
ValueCountFrequency (%)
. 69
90.8%
, 5
 
6.6%
" 2
 
2.6%
Space Separator
ValueCountFrequency (%)
21217
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 71
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 167638
88.7%
Common 21367
 
11.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 25126
15.0%
E 18119
10.8%
R 15012
 
9.0%
O 13589
 
8.1%
I 12774
 
7.6%
L 11756
 
7.0%
N 11025
 
6.6%
S 7717
 
4.6%
D 6953
 
4.1%
C 6006
 
3.6%
Other values (16) 39561
23.6%
Common
ValueCountFrequency (%)
21217
99.3%
- 71
 
0.3%
. 69
 
0.3%
, 5
 
< 0.1%
" 2
 
< 0.1%
+ 1
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 189005
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 25126
13.3%
21217
11.2%
E 18119
 
9.6%
R 15012
 
7.9%
O 13589
 
7.2%
I 12774
 
6.8%
L 11756
 
6.2%
N 11025
 
5.8%
S 7717
 
4.1%
D 6953
 
3.7%
Other values (24) 45717
24.2%

NIVEL
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

DIVERSIFICADO: 8284

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 107692
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DIVERSIFICADO
2nd row: DIVERSIFICADO
3rd row: DIVERSIFICADO
4th row: DIVERSIFICADO
5th row: DIVERSIFICADO

Common Values

Value          Count  Frequency (%)
DIVERSIFICADO  8284   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
diversificado 8284
100.0%

Most occurring characters

ValueCountFrequency (%)
I 24852
23.1%
D 16568
15.4%
V 8284
 
7.7%
E 8284
 
7.7%
R 8284
 
7.7%
S 8284
 
7.7%
F 8284
 
7.7%
C 8284
 
7.7%
A 8284
 
7.7%
O 8284
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 107692
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 24852
23.1%
D 16568
15.4%
V 8284
 
7.7%
E 8284
 
7.7%
R 8284
 
7.7%
S 8284
 
7.7%
F 8284
 
7.7%
C 8284
 
7.7%
A 8284
 
7.7%
O 8284
 
7.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 107692
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 24852
23.1%
D 16568
15.4%
V 8284
 
7.7%
E 8284
 
7.7%
R 8284
 
7.7%
S 8284
 
7.7%
F 8284
 
7.7%
C 8284
 
7.7%
A 8284
 
7.7%
O 8284
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 107692
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 24852
23.1%
D 16568
15.4%
V 8284
 
7.7%
E 8284
 
7.7%
R 8284
 
7.7%
S 8284
 
7.7%
F 8284
 
7.7%
C 8284
 
7.7%
A 8284
 
7.7%
O 8284
 
7.7%

SECTOR
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

PRIVADO: 7169
OFICIAL: 827
COOPERATIVA: 157
MUNICIPAL: 131

Length

Max length: 11
Median length: 7
Mean length: 7.107436
Min length: 7

Characters and Unicode

Total characters: 58878
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRIVADO
2nd row: PRIVADO
3rd row: PRIVADO
4th row: OFICIAL
5th row: OFICIAL

Common Values

Value        Count  Frequency (%)
PRIVADO      7169   86.5%
OFICIAL      827    10.0%
COOPERATIVA  157    1.9%
MUNICIPAL    131    1.6%
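
The imbalance percentage quoted in the alerts for this variable (64.2%) is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below is a hedged reconstruction: the function name `imbalance_score` is hypothetical and the formula is inferred from the fact that it reproduces the reported figures, so it should not be read as the tool's documented definition.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean that a
    single class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts copied from the SECTOR common values table.
sector_score = imbalance_score([7169, 827, 157, 131])
# Counts reported for AREA: URBANA, RURAL, SIN ESPECIFICAR.
area_score = imbalance_score([6818, 1465, 1])

print(f"SECTOR: {sector_score:.1%}")
print(f"AREA: {area_score:.1%}")
```

Normalizing by log2 of the number of classes keeps the score comparable between variables with different numbers of categories.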

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
privado 7169
86.5%
oficial 827
 
10.0%
cooperativa 157
 
1.9%
municipal 131
 
1.6%

Most occurring characters

ValueCountFrequency (%)
I 9242
15.7%
A 8441
14.3%
O 8310
14.1%
P 7457
12.7%
R 7326
12.4%
V 7326
12.4%
D 7169
12.2%
C 1115
 
1.9%
L 958
 
1.6%
F 827
 
1.4%
Other values (5) 707
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 58878
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 9242
15.7%
A 8441
14.3%
O 8310
14.1%
P 7457
12.7%
R 7326
12.4%
V 7326
12.4%
D 7169
12.2%
C 1115
 
1.9%
L 958
 
1.6%
F 827
 
1.4%
Other values (5) 707
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 58878
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 9242
15.7%
A 8441
14.3%
O 8310
14.1%
P 7457
12.7%
R 7326
12.4%
V 7326
12.4%
D 7169
12.2%
C 1115
 
1.9%
L 958
 
1.6%
F 827
 
1.4%
Other values (5) 707
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 58878
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 9242
15.7%
A 8441
14.3%
O 8310
14.1%
P 7457
12.7%
R 7326
12.4%
V 7326
12.4%
D 7169
12.2%
C 1115
 
1.9%
L 958
 
1.6%
F 827
 
1.4%
Other values (5) 707
 
1.2%

AREA
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Category | Count
URBANA | 6818
RURAL | 1465
SIN ESPECIFICAR | 1

Length

Max length: 15
Median length: 6
Mean length: 5.8242395
Min length: 5

Characters and Unicode

Total characters: 48248
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: URBANA
2nd row: URBANA
3rd row: URBANA
4th row: URBANA
5th row: URBANA

Common Values

Value | Count | Frequency (%)
URBANA | 6818 | 82.3%
RURAL | 1465 | 17.7%
SIN ESPECIFICAR | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
urbana 6818
82.3%
rural 1465
 
17.7%
sin 1
 
< 0.1%
especificar 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
A 15102
31.3%
R 9749
20.2%
U 8283
17.2%
N 6819
14.1%
B 6818
14.1%
L 1465
 
3.0%
I 3
 
< 0.1%
S 2
 
< 0.1%
E 2
 
< 0.1%
C 2
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 48247
> 99.9%
Space Separator 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 15102
31.3%
R 9749
20.2%
U 8283
17.2%
N 6819
14.1%
B 6818
14.1%
L 1465
 
3.0%
I 3
 
< 0.1%
S 2
 
< 0.1%
E 2
 
< 0.1%
C 2
 
< 0.1%
Other values (2) 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1
100.0%
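
The category totals for AREA can be reproduced with Python's standard `unicodedata` module. The sketch below rebuilds the column from the value counts in this report (row order does not matter for character totals) and tallies the general category of every character. `Lu` is Uppercase Letter and `Zs` is Space Separator. The helper name `category_totals` is illustrative. The standard library does not expose a character's script or block, so only the category table is covered.

```python
import unicodedata
from collections import Counter

# AREA values and their row counts, as listed in this report.
area_counts = {"URBANA": 6818, "RURAL": 1465, "SIN ESPECIFICAR": 1}


def category_totals(value_counts):
    """Count characters per Unicode general category, weighting each value by its row count."""
    totals = Counter()
    for value, rows in value_counts.items():
        for ch in value:
            totals[unicodedata.category(ch)] += rows
    return totals


totals = category_totals(area_counts)
print(dict(totals))
```

The single space in the one SIN ESPECIFICAR row is what raises the distinct category count for AREA from 1 to 2.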

Most occurring scripts

ValueCountFrequency (%)
Latin 48247
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 15102
31.3%
R 9749
20.2%
U 8283
17.2%
N 6819
14.1%
B 6818
14.1%
L 1465
 
3.0%
I 3
 
< 0.1%
S 2
 
< 0.1%
E 2
 
< 0.1%
C 2
 
< 0.1%
Other values (2) 2
 
< 0.1%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 15102
31.3%
R 9749
20.2%
U 8283
17.2%
N 6819
14.1%
B 6818
14.1%
L 1465
 
3.0%
I 3
 
< 0.1%
S 2
 
< 0.1%
E 2
 
< 0.1%
C 2
 
< 0.1%
Other values (3) 3
 
< 0.1%

STATUS
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Category | Count
ABIERTA | 5945
CERRADA TEMPORALMENTE | 2225
TEMPORAL TITULOS | 111
TEMPORAL NOMBRAMIENTO | 3

Length

Max length: 21
Median length: 7
Mean length: 10.885925
Min length: 7

Characters and Unicode

Total characters: 90179
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ABIERTA
2nd row: ABIERTA
3rd row: ABIERTA
4th row: ABIERTA
5th row: ABIERTA

Common Values

Value | Count | Frequency (%)
ABIERTA | 5945 | 71.8%
CERRADA TEMPORALMENTE | 2225 | 26.9%
TEMPORAL TITULOS | 111 | 1.3%
TEMPORAL NOMBRAMIENTO | 3 | < 0.1%
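
The length statistics and the total character count reported for STATUS follow directly from the value counts. A minimal sketch, assuming lengths are plain character counts per row (spaces included), with the column rebuilt from the counts in this report:

```python
from statistics import mean, median

# STATUS values and their row counts, as listed in this report.
status_counts = {
    "ABIERTA": 5945,
    "CERRADA TEMPORALMENTE": 2225,
    "TEMPORAL TITULOS": 111,
    "TEMPORAL NOMBRAMIENTO": 3,
}

# One length per row, expanded from the value counts.
lengths = [len(value) for value, rows in status_counts.items() for _ in range(rows)]

print("rows:", len(lengths))
print("total characters:", sum(lengths))
print("min / median / max:", min(lengths), median(lengths), max(lengths))
print("mean:", round(mean(lengths), 6))
```

Because more than half of the rows are ABIERTA (7 characters), the median stays at 7 even though the mean is pulled up towards 11 by the 21-character values.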

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
abierta | 5945 | 56.0%
cerrada | 2225 | 20.9%
temporalmente | 2225 | 20.9%
temporal | 114 | 1.1%
titulos | 111 | 1.0%
nombramiento | 3 | < 0.1%
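
This table counts words rather than whole values, which is why its percentages differ from the Common Values table (ABIERTA falls from 71.8% of rows to 56.0% of words). A minimal sketch, assuming the tool lowercases each value and splits it on whitespace; that tokenisation is inferred from the numbers and is not documented in the report:

```python
from collections import Counter

# STATUS values and their row counts, as listed in this report.
status_counts = {
    "ABIERTA": 5945,
    "CERRADA TEMPORALMENTE": 2225,
    "TEMPORAL TITULOS": 111,
    "TEMPORAL NOMBRAMIENTO": 3,
}

# Lowercase each value, split on whitespace, and weight every word by the value's row count.
words = Counter()
for value, rows in status_counts.items():
    for word in value.lower().split():
        words[word] += rows

total = sum(words.values())
for word, n in words.most_common():
    print(f"{word:<14}{n:>6}{n / total:>8.1%}")
```

The word `temporal` reaches 114 because it appears in both TEMPORAL TITULOS (111 rows) and TEMPORAL NOMBRAMIENTO (3 rows).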

Most occurring characters

ValueCountFrequency (%)
A 18682
20.7%
E 14962
16.6%
R 12737
14.1%
T 10734
11.9%
I 6059
 
6.7%
B 5948
 
6.6%
M 4570
 
5.1%
O 2456
 
2.7%
L 2450
 
2.7%
2339
 
2.6%
Other values (6) 9242
10.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 87840
97.4%
Space Separator 2339
 
2.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 18682
21.3%
E 14962
17.0%
R 12737
14.5%
T 10734
12.2%
I 6059
 
6.9%
B 5948
 
6.8%
M 4570
 
5.2%
O 2456
 
2.8%
L 2450
 
2.8%
P 2339
 
2.7%
Other values (5) 6903
 
7.9%
Space Separator
ValueCountFrequency (%)
2339
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 87840
97.4%
Common 2339
 
2.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 18682
21.3%
E 14962
17.0%
R 12737
14.5%
T 10734
12.2%
I 6059
 
6.9%
B 5948
 
6.8%
M 4570
 
5.2%
O 2456
 
2.8%
L 2450
 
2.8%
P 2339
 
2.7%
Other values (5) 6903
 
7.9%
Common
ValueCountFrequency (%)
2339
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90179
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 18682
20.7%
E 14962
16.6%
R 12737
14.1%
T 10734
11.9%
I 6059
 
6.7%
B 5948
 
6.6%
M 4570
 
5.1%
O 2456
 
2.7%
L 2450
 
2.7%
2339
 
2.6%
Other values (6) 9242
10.2%

MODALIDAD
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Category | Count
MONOLINGUE | 8053
BILINGUE | 231

Length

Max length: 10
Median length: 10
Mean length: 9.9442298
Min length: 8

Characters and Unicode

Total characters: 82378
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MONOLINGUE
2nd row: MONOLINGUE
3rd row: MONOLINGUE
4th row: MONOLINGUE
5th row: BILINGUE

Common Values

Value | Count | Frequency (%)
MONOLINGUE | 8053 | 97.2%
BILINGUE | 231 | 2.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
monolingue 8053
97.2%
bilingue 231
 
2.8%

Most occurring characters

ValueCountFrequency (%)
N 16337
19.8%
O 16106
19.6%
I 8515
10.3%
L 8284
10.1%
G 8284
10.1%
U 8284
10.1%
E 8284
10.1%
M 8053
9.8%
B 231
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 82378
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 16337
19.8%
O 16106
19.6%
I 8515
10.3%
L 8284
10.1%
G 8284
10.1%
U 8284
10.1%
E 8284
10.1%
M 8053
9.8%
B 231
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 82378
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 16337
19.8%
O 16106
19.6%
I 8515
10.3%
L 8284
10.1%
G 8284
10.1%
U 8284
10.1%
E 8284
10.1%
M 8053
9.8%
B 231
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 82378
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 16337
19.8%
O 16106
19.6%
I 8515
10.3%
L 8284
10.1%
G 8284
10.1%
U 8284
10.1%
E 8284
10.1%
M 8053
9.8%
B 231
 
0.3%

JORNADA
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Category | Count
DOBLE | 2767
VESPERTINA | 2174
MATUTINA | 2172
SIN JORNADA | 816
NOCTURNA | 274

Length

Max length: 11
Median length: 10
Mean length: 7.8378803
Min length: 5

Characters and Unicode

Total characters: 64929
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MATUTINA
2nd row: MATUTINA
3rd row: MATUTINA
4th row: MATUTINA
5th row: VESPERTINA

Common Values

Value | Count | Frequency (%)
DOBLE | 2767 | 33.4%
VESPERTINA | 2174 | 26.2%
MATUTINA | 2172 | 26.2%
SIN JORNADA | 816 | 9.9%
NOCTURNA | 274 | 3.3%
INTERMEDIA | 81 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
doble 2767
30.4%
vespertina 2174
23.9%
matutina 2172
23.9%
sin 816
 
9.0%
jornada 816
 
9.0%
nocturna 274
 
3.0%
intermedia 81
 
0.9%

Most occurring characters

ValueCountFrequency (%)
A 8505
13.1%
E 7277
11.2%
T 6873
10.6%
N 6607
10.2%
I 5324
 
8.2%
O 3857
 
5.9%
D 3664
 
5.6%
R 3345
 
5.2%
S 2990
 
4.6%
L 2767
 
4.3%
Other values (8) 13720
21.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 64113
98.7%
Space Separator 816
 
1.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 8505
13.3%
E 7277
11.4%
T 6873
10.7%
N 6607
10.3%
I 5324
8.3%
O 3857
 
6.0%
D 3664
 
5.7%
R 3345
 
5.2%
S 2990
 
4.7%
L 2767
 
4.3%
Other values (7) 12904
20.1%
Space Separator
ValueCountFrequency (%)
816
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 64113
98.7%
Common 816
 
1.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 8505
13.3%
E 7277
11.4%
T 6873
10.7%
N 6607
10.3%
I 5324
8.3%
O 3857
 
6.0%
D 3664
 
5.7%
R 3345
 
5.2%
S 2990
 
4.7%
L 2767
 
4.3%
Other values (7) 12904
20.1%
Common
ValueCountFrequency (%)
816
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64929
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 8505
13.1%
E 7277
11.2%
T 6873
10.6%
N 6607
10.2%
I 5324
 
8.2%
O 3857
 
5.9%
D 3664
 
5.6%
R 3345
 
5.2%
S 2990
 
4.6%
L 2767
 
4.3%
Other values (8) 13720
21.1%

PLAN
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Category | Count
DIARIO(REGULAR) | 4988
FIN DE SEMANA | 2197
SEMIPRESENCIAL (FIN DE SEMANA) | 424
SEMIPRESENCIAL (UN DIA A LA SEMANA) | 317
A DISTANCIA | 130
Other values (8) | 228

Length

Max length: 37
Median length: 15
Mean length: 16.039957
Min length: 5

Characters and Unicode

Total characters: 132875
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DIARIO(REGULAR)
2nd row: DIARIO(REGULAR)
3rd row: DIARIO(REGULAR)
4th row: DIARIO(REGULAR)
5th row: DIARIO(REGULAR)

Common Values

Value | Count | Frequency (%)
DIARIO(REGULAR) | 4988 | 60.2%
FIN DE SEMANA | 2197 | 26.5%
SEMIPRESENCIAL (FIN DE SEMANA) | 424 | 5.1%
SEMIPRESENCIAL (UN DIA A LA SEMANA) | 317 | 3.8%
A DISTANCIA | 130 | 1.6%
SEMIPRESENCIAL | 75 | 0.9%
SEMIPRESENCIAL (DOS DIAS A LA SEMANA) | 52 | 0.6%
VIRTUAL A DISTANCIA | 41 | 0.5%
SABATINO | 40 | 0.5%
DOMINICAL | 14 | 0.2%
Other values (3) | 6 | 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
diario(regular 4988
31.2%
semana 2990
18.7%
fin 2621
16.4%
de 2621
16.4%
semipresencial 868
 
5.4%
a 540
 
3.4%
la 369
 
2.3%
un 317
 
2.0%
dia 317
 
2.0%
distancia 171
 
1.1%
Other values (8) 205
 
1.3%

Most occurring characters

ValueCountFrequency (%)
A 18585
14.0%
R 15881
12.0%
I 15159
11.4%
E 13207
9.9%
D 8217
 
6.2%
7723
 
5.8%
N 7023
 
5.3%
L 6284
 
4.7%
( 5781
 
4.4%
) 5781
 
4.4%
Other values (12) 29234
22.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 113590
85.5%
Space Separator 7723
 
5.8%
Open Punctuation 5781
 
4.4%
Close Punctuation 5781
 
4.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 18585
16.4%
R 15881
14.0%
I 15159
13.3%
E 13207
11.6%
D 8217
7.2%
N 7023
 
6.2%
L 6284
 
5.5%
U 5348
 
4.7%
O 5098
 
4.5%
S 5041
 
4.4%
Other values (9) 13747
12.1%
Space Separator
ValueCountFrequency (%)
7723
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5781
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5781
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 113590
85.5%
Common 19285
 
14.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 18585
16.4%
R 15881
14.0%
I 15159
13.3%
E 13207
11.6%
D 8217
7.2%
N 7023
 
6.2%
L 6284
 
5.5%
U 5348
 
4.7%
O 5098
 
4.5%
S 5041
 
4.4%
Other values (9) 13747
12.1%
Common
ValueCountFrequency (%)
7723
40.0%
( 5781
30.0%
) 5781
30.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 132875
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 18585
14.0%
R 15881
12.0%
I 15159
11.4%
E 13207
9.9%
D 8217
 
6.2%
7723
 
5.8%
N 7023
 
5.3%
L 6284
 
4.7%
( 5781
 
4.4%
) 5781
 
4.4%
Other values (12) 29234
22.0%

DEPARTAMENTAL
Categorical

HIGH CORRELATION 

Distinct: 23
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 64.8 KiB

Category | Count
GUATEMALA NORTE | 1037
GUATEMALA SUR | 796
GUATEMALA OCCIDENTE | 774
ESCUINTLA | 599
HUEHUETENANGO | 495
Other values (18) | 4583

Length

Max length: 19
Median length: 14
Mean length: 12.155843
Min length: 5

Characters and Unicode

Total characters: 100699
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ALTA VERAPAZ
2nd row: ALTA VERAPAZ
3rd row: ALTA VERAPAZ
4th row: ALTA VERAPAZ
5th row: ALTA VERAPAZ

Common Values

Value | Count | Frequency (%)
GUATEMALA NORTE | 1037 | 12.5%
GUATEMALA SUR | 796 | 9.6%
GUATEMALA OCCIDENTE | 774 | 9.3%
ESCUINTLA | 599 | 7.2%
HUEHUETENANGO | 495 | 6.0%
QUETZALTENANGO | 476 | 5.7%
PETEN | 379 | 4.6%
SUCHITEPEQUEZ | 377 | 4.6%
GUATEMALA ORIENTE | 363 | 4.4%
IZABAL | 360 | 4.3%
Other values (13) | 2628 | 31.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
guatemala 2970
25.0%
norte 1084
 
9.1%
sur 796
 
6.7%
occidente 774
 
6.5%
escuintla 599
 
5.0%
huehuetenango 495
 
4.2%
quetzaltenango 476
 
4.0%
verapaz 468
 
3.9%
peten 379
 
3.2%
suchitepequez 377
 
3.2%
Other values (15) 3473
29.2%

Most occurring characters

ValueCountFrequency (%)
A 16901
16.8%
E 14108
14.0%
T 9811
9.7%
U 8423
 
8.4%
L 6167
 
6.1%
N 6019
 
6.0%
G 4412
 
4.4%
O 3965
 
3.9%
I 3818
 
3.8%
C 3782
 
3.8%
Other values (12) 23293
23.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 97092
96.4%
Space Separator 3607
 
3.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 16901
17.4%
E 14108
14.5%
T 9811
10.1%
U 8423
8.7%
L 6167
 
6.4%
N 6019
 
6.2%
G 4412
 
4.5%
O 3965
 
4.1%
I 3818
 
3.9%
C 3782
 
3.9%
Other values (11) 19686
20.3%
Space Separator
ValueCountFrequency (%)
3607
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 97092
96.4%
Common 3607
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 16901
17.4%
E 14108
14.5%
T 9811
10.1%
U 8423
8.7%
L 6167
 
6.4%
N 6019
 
6.2%
G 4412
 
4.5%
O 3965
 
4.1%
I 3818
 
3.9%
C 3782
 
3.9%
Other values (11) 19686
20.3%
Common
ValueCountFrequency (%)
3607
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100699
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 16901
16.8%
E 14108
14.0%
T 9811
9.7%
U 8423
 
8.4%
L 6167
 
6.1%
N 6019
 
6.0%
G 4412
 
4.4%
O 3965
 
3.9%
I 3818
 
3.8%
C 3782
 
3.8%
Other values (12) 23293
23.1%


Variable | Type | Flags | Missing | Missing (%) | Memory size
(blank column name) | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 1 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 2 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 3 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 4 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 5 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 6 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 7 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 8 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 9 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 10 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 11 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 12 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 13 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 14 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 15 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 16 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB
Unnamed: 0 | Unsupported | MISSING, REJECTED, UNSUPPORTED | 8284 | 100.0% | 64.8 KiB

ZONA
Categorical

HIGH CORRELATION  MISSING 

Distinct: 21
Distinct (%): 1.4%
Missing: 6748
Missing (%): 81.5%
Memory size: 64.8 KiB

Category | Count
ZONA 1 | 628
ZONA 7 | 173
ZONA 12 | 114
ZONA 18 | 102
ZONA 6 | 71
Other values (16) | 448

Length

Max length: 7
Median length: 6
Mean length: 6.3255208
Min length: 6

Characters and Unicode

Total characters: 9716
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ZONA 1
2nd row: ZONA 1
3rd row: ZONA 1
4th row: ZONA 1
5th row: ZONA 1

Common Values

Value | Count | Frequency (%)
ZONA 1 | 628 | 7.6%
ZONA 7 | 173 | 2.1%
ZONA 12 | 114 | 1.4%
ZONA 18 | 102 | 1.2%
ZONA 6 | 71 | 0.9%
ZONA 11 | 62 | 0.7%
ZONA 2 | 54 | 0.7%
ZONA 19 | 53 | 0.6%
ZONA 13 | 46 | 0.6%
ZONA 3 | 40 | 0.5%
Other values (11) | 193 | 2.3%
(Missing) | 6748 | 81.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
zona 1536
50.0%
1 628
20.4%
7 173
 
5.6%
12 114
 
3.7%
18 102
 
3.3%
6 71
 
2.3%
11 62
 
2.0%
2 54
 
1.8%
19 53
 
1.7%
13 46
 
1.5%
Other values (12) 233
 
7.6%

Most occurring characters

ValueCountFrequency (%)
Z 1536
15.8%
O 1536
15.8%
N 1536
15.8%
A 1536
15.8%
1536
15.8%
1 1188
12.2%
2 202
 
2.1%
7 193
 
2.0%
8 107
 
1.1%
6 89
 
0.9%
Other values (5) 257
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6144
63.2%
Decimal Number 2036
 
21.0%
Space Separator 1536
 
15.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1188
58.3%
2 202
 
9.9%
7 193
 
9.5%
8 107
 
5.3%
6 89
 
4.4%
3 86
 
4.2%
9 81
 
4.0%
5 44
 
2.2%
0 27
 
1.3%
4 19
 
0.9%
Uppercase Letter
ValueCountFrequency (%)
Z 1536
25.0%
O 1536
25.0%
N 1536
25.0%
A 1536
25.0%
Space Separator
ValueCountFrequency (%)
1536
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6144
63.2%
Common 3572
36.8%

Most frequent character per script

Common
ValueCountFrequency (%)
1536
43.0%
1 1188
33.3%
2 202
 
5.7%
7 193
 
5.4%
8 107
 
3.0%
6 89
 
2.5%
3 86
 
2.4%
9 81
 
2.3%
5 44
 
1.2%
0 27
 
0.8%
Latin
ValueCountFrequency (%)
Z 1536
25.0%
O 1536
25.0%
N 1536
25.0%
A 1536
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9716
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Z 1536
15.8%
O 1536
15.8%
N 1536
15.8%
A 1536
15.8%
1536
15.8%
1 1188
12.2%
2 202
 
2.1%
7 193
 
2.0%
8 107
 
1.1%
6 89
 
0.9%
Other values (5) 257
 
2.6%

Interactions


Correlations

Variable | Unnamed: 0.1 | DEPARTAMENTO | SECTOR | AREA | STATUS | MODALIDAD | JORNADA | PLAN | DEPARTAMENTAL | ZONA
Unnamed: 0.1 | 1.000 | 0.758 | 0.084 | 0.175 | 0.116 | 0.141 | 0.111 | 0.088 | 0.805 | 0.891
DEPARTAMENTO | 0.758 | 1.000 | 0.139 | 0.210 | 0.127 | 0.310 | 0.126 | 0.112 | 1.000 | 1.000
SECTOR | 0.084 | 0.139 | 1.000 | 0.120 | 0.075 | 0.109 | 0.136 | 0.133 | 0.144 | 0.225
AREA | 0.175 | 0.210 | 0.120 | 1.000 | 0.034 | 0.111 | 0.081 | 0.067 | 0.224 | 0.265
STATUS | 0.116 | 0.127 | 0.075 | 0.034 | 1.000 | 0.020 | 0.160 | 0.135 | 0.134 | 0.126
MODALIDAD | 0.141 | 0.310 | 0.109 | 0.111 | 0.020 | 1.000 | 0.081 | 0.078 | 0.315 | 0.000
JORNADA | 0.111 | 0.126 | 0.136 | 0.081 | 0.160 | 0.081 | 1.000 | 0.562 | 0.133 | 0.093
PLAN | 0.088 | 0.112 | 0.133 | 0.067 | 0.135 | 0.078 | 0.562 | 1.000 | 0.119 | 0.065
DEPARTAMENTAL | 0.805 | 1.000 | 0.144 | 0.224 | 0.134 | 0.315 | 0.133 | 0.119 | 1.000 | 0.994
ZONA | 0.891 | 1.000 | 0.225 | 0.265 | 0.126 | 0.000 | 0.093 | 0.065 | 0.994 | 1.000
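
The report does not say which statistic fills this matrix. For pairs of categorical columns a common choice is Cramér's V, which is 0 for independent columns and 1 when one column fully determines the other. The sketch below assumes that statistic, without bias correction. The function name `cramers_v` and the toy columns are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0  # undefined for a constant column; 0.0 is a convention chosen here
    chi2 = 0.0
    for r, r_n in row_totals.items():
        for c, c_n in col_totals.items():
            expected = r_n * c_n / n
            chi2 += (joint.get((r, c), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical toy data: the second column is a pure relabelling of the first.
dept = ["ALTA VERAPAZ", "ALTA VERAPAZ", "PETEN", "PETEN", "IZABAL", "IZABAL"]
office = ["A", "A", "B", "B", "C", "C"]

# Hypothetical toy data: every combination appears equally often (independent columns).
area = ["URBANA", "RURAL", "URBANA", "RURAL"]
sector = ["PRIVADO", "PRIVADO", "OFICIAL", "OFICIAL"]

v_dependent = cramers_v(dept, office)
v_independent = cramers_v(area, sector)
print(v_dependent, v_independent)
```

Read under that assumption, the 1.000 between DEPARTAMENTO and DEPARTAMENTAL would mean that one of the two columns can be derived entirely from the other, while values near 0.1 (for example SECTOR against most other columns) indicate little association.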

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
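
Nullity correlation is the ordinary Pearson correlation between two columns' missing-value indicators. As an illustration, the sketch below uses TELEFONO and DIRECTOR from the last ten sample rows of this report (indices 8274 to 8283). The 1/0 vectors were transcribed by hand from the Sample section, so treat them as an assumption, and the helper name `pearson` is illustrative.

```python
import math

# 1 = missing (NaN), 0 = present, for sample rows 8274..8283 (hand-transcribed).
telefono_missing = [1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
director_missing = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]


def pearson(a, b):
    """Pearson correlation of two equally long numeric sequences.

    Undefined (division by zero) when either sequence is constant,
    i.e. when a column is never missing or always missing.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


r = pearson(telefono_missing, director_missing)
print("missing share TELEFONO:", sum(telefono_missing) / len(telefono_missing))
print("missing share DIRECTOR:", sum(director_missing) / len(director_missing))
print("nullity correlation:", round(r, 3))
```

A value near 1 means the two fields tend to be missing in the same rows; ten rows are far too few to say anything about the full 8284-row dataset, so this only illustrates the calculation.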

Sample

First rows

(index) | Unnamed: 0.1 | CODIGO | DISTRITO | DEPARTAMENTO | MUNICIPIO | ESTABLECIMIENTO | DIRECCION | TELEFONO | SUPERVISOR | DIRECTOR | NIVEL | SECTOR | AREA | STATUS | MODALIDAD | JORNADA | PLAN | DEPARTAMENTAL
0 | 0 | 16-01-0138-46 | 16-031 | ALTA VERAPAZ | COBAN | COLEGIO COBAN | KM.2 SALIDA A SAN JUAN CHAMELCO ZONA 8 | 77945104 | MERCEDES JOSEFINA TORRES GALVEZ | JULIO CESAR VILLELA AMADO | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | MATUTINA | DIARIO(REGULAR) | ALTA VERAPAZ
1 | 1 | 16-01-0139-46 | 16-031 | ALTA VERAPAZ | COBAN | COLEGIO PARTICULAR MIXTO VERAPAZ | KM 209.5 ENTRADA A LA CIUDAD | 77367402 | MERCEDES JOSEFINA TORRES GALVEZ | NaN | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | MATUTINA | DIARIO(REGULAR) | ALTA VERAPAZ
2 | 2 | 16-01-0140-46 | 16-031 | ALTA VERAPAZ | COBAN | COLEGIO "LA INMACULADA" | 7A. AVENIDA 11-109 ZONA 6 | 78232301 | MERCEDES JOSEFINA TORRES GALVEZ | VIRGINA SOLANO SERRANO | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | MATUTINA | DIARIO(REGULAR) | ALTA VERAPAZ
3 | 3 | 16-01-0141-46 | 16-005 | ALTA VERAPAZ | COBAN | ESCUELA NACIONAL DE CIENCIAS COMERCIALES | 2A CALLE 11-10 ZONA 2 | 79514215 | RUDY ADOLFO TOT OCH | NaN | DIVERSIFICADO | OFICIAL | URBANA | ABIERTA | MONOLINGUE | MATUTINA | DIARIO(REGULAR) | ALTA VERAPAZ
4 | 4 | 16-01-0142-46 | 16-005 | ALTA VERAPAZ | COBAN | INSTITUTO NORMAL MIXTO DEL NORTE 'EMILIO ROSALES PONCE' | 3A AVE 6-23 ZONA 11 | 79521468 | RUDY ADOLFO TOT OCH | NaN | DIVERSIFICADO | OFICIAL | URBANA | ABIERTA | BILINGUE | VESPERTINA | DIARIO(REGULAR) | ALTA VERAPAZ
5 | 5 | 16-01-0143-46 | 16-031 | ALTA VERAPAZ | COBAN | COLEGIO PARTICULAR MIXTO IMPERIAL | 5A. CALLE 1-9 ZONA 3 | 57101061 | MERCEDES JOSEFINA TORRES GALVEZ | HECOTR WALDEMAR TOT COY | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | DOBLE | FIN DE SEMANA | ALTA VERAPAZ
6 | 6 | 16-01-0145-46 | 16-006 | ALTA VERAPAZ | COBAN | INSTITUTO DE TURSMO Y AVIACON DEL NORTE I.T.A.N | 3 AV. 5-28 ZONA 4 | 54641454 | EFRAIN CAAL CUC | LUIS FERNANDO SOTO | DIVERSIFICADO | PRIVADO | URBANA | CERRADA TEMPORALMENTE | MONOLINGUE | MATUTINA | DIARIO(REGULAR) | ALTA VERAPAZ
7 | 7 | 16-01-0147-46 | 16-031 | ALTA VERAPAZ | COBAN | COLEGIO "LA INMACULADA" | 7A. CALLE 11-09 ZONA 6 COBAN | 49532425 | MERCEDES JOSEFINA TORRES GALVEZ | MERCEDES QUIROS QUIROS | DIVERSIFICADO | PRIVADO | RURAL | CERRADA TEMPORALMENTE | MONOLINGUE | DOBLE | DIARIO(REGULAR) | ALTA VERAPAZ
8 | 8 | 16-01-0150-46 | 16-006 | ALTA VERAPAZ | COBAN | INSTITUTO INTERCULTRUAL ALTAVERAPACENCESE -IIAV- | 3A. AVAENIDA 1-23 ZONA 4 | NaN | EFRAIN CAAL CUC | GUILLERMO ESTUARDO VASQUEZ MORALES | DIVERSIFICADO | PRIVADO | URBANA | CERRADA TEMPORALMENTE | BILINGUE | DOBLE | FIN DE SEMANA | ALTA VERAPAZ
9 | 9 | 16-01-0155-46 | 16-031 | ALTA VERAPAZ | COBAN | LICEO "MODERNO LATINO" | 11 AVENIDA 5-17 ZONA 4 | 79522555 | MERCEDES JOSEFINA TORRES GALVEZ | JORGE BENEDICTO COC POP | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | DOBLE | FIN DE SEMANA | ALTA VERAPAZ

Last rows

(index) | Unnamed: 0.1 | CODIGO | DISTRITO | DEPARTAMENTO | MUNICIPIO | ESTABLECIMIENTO | DIRECCION | TELEFONO | SUPERVISOR | DIRECTOR | NIVEL | SECTOR | AREA | STATUS | MODALIDAD | JORNADA | PLAN | DEPARTAMENTAL
8274 | 8312 | 03-14-1249-46 | 03-008 | SACATEPEQUEZ | ALOTENANGO | COLEGIO SAN JUAN ALOTENANGO | 2DO. CANTON | NaN | JUAN DEMETRIO SICAJOL PEREZ | NaN | DIVERSIFICADO | PRIVADO | URBANA | CERRADA TEMPORALMENTE | MONOLINGUE | DOBLE | FIN DE SEMANA | SACATEPEQUEZ
8275 | 8313 | 03-15-0011-46 | 03-008 | SACATEPEQUEZ | SAN ANTONIO AGUAS CALIENTES | CENTRO EDUCATIVO MIXTO GUATEMALTECO CESAR BRANAS | 2A. AVENIDA A 1-13 ZONA 2 | 50894387 | JUAN DEMETRIO SICAJOL PEREZ | LUIS ALONZO NAVAS RIVERA | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | DOBLE | FIN DE SEMANA | SACATEPEQUEZ
8276 | 8314 | 03-15-0013-46 | 03-008 | SACATEPEQUEZ | SAN ANTONIO AGUAS CALIENTES | CENTRO EDUCATIVO MIXTO GUATEMALTECO CESAR BRANAS | 2A. AVENIDA A 1-13 ZONA 2 | 52672187 | JUAN DEMETRIO SICAJOL PEREZ | LUIS ALONZO NAVAS RIVERA | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | DOBLE | FIN DE SEMANA | SACATEPEQUEZ
8277 | 8315 | 03-15-0016-46 | 03-008 | SACATEPEQUEZ | SAN ANTONIO AGUAS CALIENTES | CENTRO EDUCATIVO MIXTO GUATEMALTECO "CESAR BRANAS" | SEGUNDA AVENIDA A 1-13 ZONA 2 | 79439866 | JUAN DEMETRIO SICAJOL PEREZ | LUIS ALONZO NAVAS RIVERA | DIVERSIFICADO | PRIVADO | URBANA | ABIERTA | MONOLINGUE | DOBLE | FIN DE SEMANA | SACATEPEQUEZ
8278 | 8316 | 03-15-0019-46 | 03-008 | SACATEPEQUEZ | SAN ANTONIO AGUAS CALIENTES | INSTITUTO NACIONAL DE EDUCACION DIVERSIFICADA | 6TA. CALLE FINAL 4-81 ZONA 2 | 44021935 | JUAN DEMETRIO SICAJOL PEREZ | EDY ROMUALDO PEREZ SINAY | DIVERSIFICADO | OFICIAL | URBANA | ABIERTA | MONOLINGUE | VESPERTINA | DIARIO(REGULAR) | SACATEPEQUEZ
8279 | 8317 | 03-15-0022-46 | 03-008 | SACATEPEQUEZ | SAN ANTONIO AGUAS CALIENTES | INSTITUTO NACIONAL DE EDUCACION DIVERSIFICADA | 6TA. CALLE FINAL 4-81, ZONA 2 | NaN | JUAN DEMETRIO SICAJOL PEREZ | EDY PEREZ SINAY | DIVERSIFICADO | OFICIAL | URBANA | ABIERTA | MONOLINGUE | DOBLE | DIARIO(REGULAR) | SACATEPEQUEZ
8280 | 8318 | 03-16-0005-46 | 03-008 | SACATEPEQUEZ | SANTA CATARINA BARAHONA | INSTITUTO MUNICIPAL DE EDUCACION BASICA Y BACHILLERATO EN CIENCIAS Y LETRAS POR MADUREZ | CALLE PRINCIPAL | 42645553 | JUAN DEMETRIO SICAJOL PEREZ | OSCAR HUMBERTO HERNANDEZ LAZARO | DIVERSIFICADO | MUNICIPAL | URBANA | CERRADA TEMPORALMENTE | MONOLINGUE | NOCTURNA | DIARIO(REGULAR) | SACATEPEQUEZ
8281 | 8319 | 03-16-0006-46 | 03-008 | SACATEPEQUEZ | SANTA CATARINA BARAHONA | INSTITUTO MUNICIPAL DE EDUCACION BASICA Y BACHILLERATO EN CIENCIAS Y LETRAS POR MADUREZ | CALLE PRINCIPAL | 45707357 | JUAN DEMETRIO SICAJOL PEREZ | CLAUDIA JEANETH CHAVEZ FLORES | DIVERSIFICADO | MUNICIPAL | URBANA | ABIERTA | MONOLINGUE | DOBLE | FIN DE SEMANA | SACATEPEQUEZ
8282 | 8320 | 03-16-0010-46 | 03-008 | SACATEPEQUEZ | SANTA CATARINA BARAHONA | INSTITUTO MUNICIPAL DE EDUCACION BASICA Y BACHILLERATO POR MADUREZ | CALLE PRINCIPAL SANTA CATARINA BARAHONA | 41575081 | JUAN DEMETRIO SICAJOL PEREZ | CLAUDIA JEANETH CHAVEZ FLORES | DIVERSIFICADO | MUNICIPAL | URBANA | ABIERTA | MONOLINGUE | SIN JORNADA | SEMIPRESENCIAL (FIN DE SEMANA) | SACATEPEQUEZ
8283 | 8321 | 03-16-1442-46 | 03-008 | SACATEPEQUEZ | SANTA CATARINA BARAHONA | INSTITUTO MUNICIPAL DE EDUCACION BASICA Y BACHILLERATO POR MADUREZ | CALLE PRINCIPAL | NaN | JUAN DEMETRIO SICAJOL PEREZ | NaN | DIVERSIFICADO | MUNICIPAL | URBANA | CERRADA TEMPORALMENTE | MONOLINGUE | DOBLE | FIN DE SEMANA | SACATEPEQUEZ

Every column after DEPARTAMENTAL (the blank-named column, Unnamed: 1 through Unnamed: 16, Unnamed: 0, and ZONA) is NaN in all rows shown.